Working Memory Capacity of ChatGPT: An Empirical Study

Authors

  • Dongyu Gong, University of Oxford; Yale University
  • Xingchen Wan, University of Oxford
  • Dingmin Wang, University of Oxford

DOI:

https://doi.org/10.1609/aaai.v38i9.28868

Keywords:

HAI: Other Foundations of Human Computation & AI, CMS: Adaptive Behavior, CMS: Other Foundations of Cognitive Modeling & Systems, CMS: Simulating Human Behavior, NLP: (Large) Language Models, NLP: Conversational AI/Dialog Systems, NLP: Interpretability, Analysis, and Evaluation of NLP Models

Abstract

Working memory is a critical aspect of both human intelligence and artificial intelligence, serving as a workspace for the temporary storage and manipulation of information. In this paper, we systematically assess the working memory capacity of ChatGPT, a large language model developed by OpenAI, by examining its performance in verbal and spatial n-back tasks under various conditions. Our experiments reveal that ChatGPT has a working memory capacity limit strikingly similar to that of humans. Furthermore, we investigate the impact of different instruction strategies on ChatGPT's performance and observe that the fundamental patterns of a capacity limit persist. From our empirical findings, we propose that n-back tasks may serve as tools for benchmarking the working memory capacity of large language models and hold potential for informing future efforts aimed at enhancing AI working memory.
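For readers unfamiliar with the paradigm, the sketch below illustrates the structure of a verbal n-back block: a letter stream in which the model must report a match whenever the current letter equals the one presented n trials earlier. This is a minimal illustration of the task logic only, not the authors' experimental code; the consonant alphabet, block length, match rate, and the 'm'/'-' response coding are assumptions made for the example.

    import random

    LETTERS = "bcdfghjklmnpqrstvwxyz"  # consonant set, a common choice in verbal n-back variants

    def generate_nback_block(n, length=24, match_rate=0.33, seed=0):
        """Generate a letter sequence and ground-truth match labels for one n-back block."""
        rng = random.Random(seed)
        seq, targets = [], []
        for i in range(length):
            if i >= n and rng.random() < match_rate:
                letter = seq[i - n]          # planned match: repeat the letter from n trials back
            else:
                letter = rng.choice(LETTERS)
            seq.append(letter)
            # Label is computed from the actual sequence, so chance repeats also count as matches.
            targets.append(i >= n and seq[i] == seq[i - n])
        return seq, targets

    def score_responses(responses, targets):
        """Tally responses ('m' = match, '-' = non-match) against the ground-truth labels."""
        hits = sum(r == "m" and t for r, t in zip(responses, targets))
        false_alarms = sum(r == "m" and not t for r, t in zip(responses, targets))
        misses = sum(r != "m" and t for r, t in zip(responses, targets))
        correct_rejections = sum(r != "m" and not t for r, t in zip(responses, targets))
        return {"hits": hits, "false_alarms": false_alarms,
                "misses": misses, "correct_rejections": correct_rejections}

    if __name__ == "__main__":
        seq, targets = generate_nback_block(n=2)
        # In an actual experiment, each letter would be presented to the model turn by turn,
        # with instructions to respond 'm' on a match; here a perfect responder is simulated.
        simulated_responses = ["m" if t else "-" for t in targets]
        print(" ".join(seq))
        print(score_responses(simulated_responses, targets))

In this framing, the load manipulation is simply the value of n (e.g., 1-back through 3-back), and accuracy or hit/false-alarm rates per n provide the capacity curve that can be compared against human performance.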

Published

2024-03-24

How to Cite

Gong, D., Wan, X., & Wang, D. (2024). Working Memory Capacity of ChatGPT: An Empirical Study. Proceedings of the AAAI Conference on Artificial Intelligence, 38(9), 10048-10056. https://doi.org/10.1609/aaai.v38i9.28868

Issue

Vol. 38 No. 9 (2024)

Section

AAAI Technical Track on Humans and AI